What is CQRS?
CQRS (Command Query Responsibility Segregation) grew out of Command Query Separation (CQS), an object-oriented principle introduced in Bertrand Meyer's book Object-Oriented Software Construction (first published in 1988). The original idea is that an object's operations can be divided into two kinds: commands and queries.
The two kinds of operation behave as follows:
- Command: Changes the state of the object when executed.
- Query: Returns a result about the object without changing its state and without side effects on the object itself.
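As a minimal sketch of the two kinds of operation, consider a toy object with one command and one query. The Account class and its method names are hypothetical, chosen only for illustration.

```python
class Account:
    """Toy object used to illustrate CQS (hypothetical example)."""

    def __init__(self, balance: int = 0) -> None:
        self._balance = balance

    def deposit(self, amount: int) -> None:
        """Command: changes the state of the object and returns nothing."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount

    def balance(self) -> int:
        """Query: returns a result and leaves the state untouched."""
        return self._balance


account = Account()
account.deposit(50)         # command: the state changes
first = account.balance()   # query: no side effects
second = account.balance()  # asking again gives the same answer
```

Because the query has no side effects, it can be called any number of times between commands and always returns the same result.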
A similar idea applies to distributed and microservice architectures, except that we reason about the behavior of services rather than objects. In practice, it is mainly used to address three things: the efficiency of data operations, the division of responsibilities among services, and the deliberate use of duplication in a distributed system to improve overall performance.
CQRS at the Database Level
In service system design, the database has long been the usual performance bottleneck, and "read-write separation" has long been a common remedy. Applying CQRS at the data storage level, mainly within the database system, is therefore the most intuitive and common scenario.
For most people, the database is where they first apply the CQRS pattern. Besides being intuitive, it is also the last-resort performance optimization for the database in monolithic applications and three-tier architectures.
Database Replication
Read-write separation in a database is mainly implemented through a replication mechanism. Most database systems ship their own solution, usually treated as part of cluster design. It typically relies on lower-level system tools, mostly provided by the vendor or by third parties: when a data change (insert, update, or delete) is detected at the lower layers of the database system, the change is synchronized to the other data stores.
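The idea can be sketched in a few lines. This is a simplified in-memory model, not how any particular database implements replication: the Primary and Replica classes are hypothetical, dictionaries stand in for storage engines, and changes are forwarded synchronously, whereas real systems often replicate asynchronously.

```python
from __future__ import annotations


class Replica:
    """Read-only copy; a dict stands in for a real storage engine."""

    def __init__(self) -> None:
        self.rows: dict[str, str] = {}

    def apply(self, op: str, key: str, value: str | None) -> None:
        if op == "delete":
            self.rows.pop(key, None)
        else:  # "insert" and "update" both end up as an upsert here
            self.rows[key] = value

    def get(self, key: str) -> str | None:
        return self.rows.get(key)


class Primary:
    """Accepts writes and forwards every change to its replicas."""

    def __init__(self, replicas: list[Replica]) -> None:
        self.rows: dict[str, str] = {}
        self.replicas = replicas

    def write(self, op: str, key: str, value: str | None = None) -> None:
        # Apply the change locally first.
        if op == "delete":
            self.rows.pop(key, None)
        else:
            self.rows[key] = value
        # Then synchronize the same change to every replica.
        for replica in self.replicas:
            replica.apply(op, key, value)


replica = Replica()
primary = Primary([replica])
primary.write("insert", "user:1", "Alice")
primary.write("update", "user:1", "Alicia")
primary.write("insert", "user:2", "Bob")
primary.write("delete", "user:2")
```

Writes go only to the primary, while reads can be served by replica.get, which is the read-write separation described earlier.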
Change Data Capture (CDC)
Tools that are native to the database system or work at the system layer usually focus on "physical data consistency": they only check that data is synchronized correctly across data stores, without regard to what the application actually needs. For this reason, solutions based on Change Data Capture (CDC) have appeared in recent years, such as:
- Yelp — MySQL Streamer
- LinkedIn — Databus
- Zendesk — Maxwell
CDC tools of this kind mainly scan or monitor the database logs and start synchronizing as soon as they detect a data change.
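The log-scanning loop can be illustrated with a small sketch. It assumes a simplified log in which every entry carries an increasing position; LogEntry, LogTailer, and synchronize are hypothetical names and do not reflect the API of any of the tools listed above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    position: int  # monotonically increasing offset in the log
    op: str        # "insert", "update" or "delete"
    row: dict      # row image; must contain an "id" key


class LogTailer:
    """Remembers the last position it has read and returns only newer entries."""

    def __init__(self, log: list[LogEntry]) -> None:
        self._log = log
        self._last_position = 0

    def poll(self) -> list[LogEntry]:
        new_entries = [e for e in self._log if e.position > self._last_position]
        if new_entries:
            self._last_position = new_entries[-1].position
        return new_entries


def synchronize(target: dict, entries: list[LogEntry]) -> None:
    """Apply captured changes to another data store (a dict in this sketch)."""
    for entry in entries:
        if entry.op == "delete":
            target.pop(entry.row["id"], None)
        else:
            target[entry.row["id"]] = entry.row


database_log = [
    LogEntry(1, "insert", {"id": 1, "name": "Alice"}),
    LogEntry(2, "insert", {"id": 2, "name": "Bob"}),
]
tailer = LogTailer(database_log)
mirror: dict = {}

synchronize(mirror, tailer.poll())  # picks up entries 1 and 2
idle = tailer.poll()                # nothing new, so nothing to do

database_log.append(LogEntry(3, "delete", {"id": 2}))
database_log.append(LogEntry(4, "update", {"id": 1, "name": "Alicia"}))
latest = tailer.poll()              # only entries 3 and 4
synchronize(mirror, latest)
```

The tailer never touches the application that wrote the data; it only observes the log, which is why this style of synchronization can be added without changing the writer.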
CQRS at the Application Level
In microservice architectures and distributed systems, CQRS applies more broadly and flexibly. It is not limited to the data storage level; combined with service-level design, it supports a wider range of usage scenarios. Moreover, when a system must scale out as a distributed architecture, duplication often matters more than reuse, which makes CQRS an indispensable integration pattern.
At the application level, CQRS mainly serves data needs that arise from the business, such as transaction mechanisms and queries over complex related data. These are hard to meet with database-level solutions and have to be implemented in services that encode the business logic.
Output events from CDC to drive distributed data processing
CDC was originally used for data synchronization within the database system, but CDC events can also be delivered directly to services. Once data changes are turned into events, they are no longer just part of the database's internal replication mechanism. The application scenarios become more diverse: beyond database synchronization, they include application-level integration in an event-driven design. In a microservice architecture, this can be combined with service splitting so that each service synchronizes only part of the data, which keeps the flexibility of a distributed architecture and can even improve access performance for specific needs.
This approach mixes the database level with the application service layer. Teams that have already adopted CDC can implement event-driven CQRS without changing the application that originates the events.
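A minimal sketch of this idea follows, assuming CDC changes arrive as plain dictionaries with table, op, and row keys. That shape, the to_event helper, and the SearchIndexService class are all assumptions made for illustration. The service keeps only the fields it needs, which is the "partial data synchronization" mentioned above.

```python
from __future__ import annotations


def to_event(change: dict) -> dict:
    """Turn a raw CDC change into an event that services can consume."""
    return {"type": f"{change['table']}.{change['op']}", "data": change["row"]}


class SearchIndexService:
    """Hypothetical service that synchronizes only part of the product data."""

    wanted_fields = ("id", "name")

    def __init__(self) -> None:
        self.docs: dict[int, dict] = {}

    def handle(self, event: dict) -> None:
        if not event["type"].startswith("products."):
            return  # this service is not interested in other tables
        row = event["data"]
        if event["type"].endswith(".delete"):
            self.docs.pop(row["id"], None)
        else:
            # Copy only the fields this service cares about.
            self.docs[row["id"]] = {f: row[f] for f in self.wanted_fields}


changes = [
    {"table": "products", "op": "insert",
     "row": {"id": 1, "name": "Widget", "price": 9, "cost": 4}},
    {"table": "orders", "op": "insert",
     "row": {"id": 7, "product_id": 1}},
    {"table": "products", "op": "insert",
     "row": {"id": 2, "name": "Gadget", "price": 19, "cost": 8}},
    {"table": "products", "op": "delete", "row": {"id": 2}},
]

search = SearchIndexService()
for change in changes:
    search.handle(to_event(change))
```

The application that writes to the products table is unchanged; only the consumer side decides which events and which fields matter.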
Event-driven distributed data processing
Using an event-driven data flow to synchronize data resembles the database-level CDC approach: whenever data changes, synchronization is performed. The difference is that the event is raised by the application rather than by the CDC mechanism, and the event itself is usually directly related to the business. In other words, synchronization is triggered by domain events, not by raw data changes.
In practice, data flows between services are not meant simply to copy data one-to-one. They trigger data processing or related tasks in response to what each event requires, and a service may ignore and discard data it does not need.
Because a single piece of data or a single command can produce several different events as needed, implementing CQRS with service-level events is usually more flexible than doing so in the database system.
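To illustrate how one command can yield several domain events, here is a small sketch. The PlaceOrder command, the event names, and the large-order threshold of 1000 are hypothetical business rules invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceOrder:
    """Hypothetical command issued by the application."""
    order_id: str
    customer_id: str
    amount: int


def handle_place_order(cmd: PlaceOrder) -> list[dict]:
    """Translate one command into the domain events the business cares about."""
    events = [
        {"type": "OrderPlaced", "order_id": cmd.order_id, "amount": cmd.amount},
        {"type": "CustomerSpendingChanged",
         "customer_id": cmd.customer_id, "delta": cmd.amount},
    ]
    # Assumed business rule: large orders raise an extra event for review.
    if cmd.amount >= 1000:
        events.append({"type": "LargeOrderFlagged", "order_id": cmd.order_id})
    return events


small = handle_place_order(PlaceOrder("o-1", "c-1", 200))
large = handle_place_order(PlaceOrder("o-2", "c-1", 1500))
```

A row-level change in the database would carry none of this business meaning; the events here are shaped by the application rather than by the table layout.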
This service-level implementation is useful in three scenarios:
- Solving data consistency problems by synchronizing databases (as with CDC)
- Optimizing the performance of relational queries
- Classifying and processing data on the fly in a real-time data flow
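The second scenario, optimizing relational queries, can be sketched as a denormalized read model that is kept up to date by events, so that a query needs no join at read time. The OrderSummaryView class and the event shapes are assumptions made for illustration.

```python
from __future__ import annotations


class OrderSummaryView:
    """Read model whose rows already contain the customer name."""

    def __init__(self) -> None:
        self._customer_names: dict[str, str] = {}
        self.rows: dict[str, dict] = {}

    def handle(self, event: dict) -> None:
        if event["type"] == "CustomerRegistered":
            self._customer_names[event["customer_id"]] = event["name"]
        elif event["type"] == "OrderPlaced":
            # The "join" happens once, when the event arrives.
            self.rows[event["order_id"]] = {
                "order_id": event["order_id"],
                "customer_name": self._customer_names.get(
                    event["customer_id"], "unknown"),
                "amount": event["amount"],
            }
        # Any other event type is ignored and discarded.

    def get(self, order_id: str) -> dict | None:
        """Query side: a single lookup, no join."""
        return self.rows.get(order_id)


view = OrderSummaryView()
for event in [
    {"type": "CustomerRegistered", "customer_id": "c-1", "name": "Alice"},
    {"type": "OrderPlaced", "order_id": "o-1", "customer_id": "c-1", "amount": 200},
    {"type": "InventoryCounted", "sku": "x"},
]:
    view.handle(event)

summary = view.get("o-1")
```

The cost is duplicated data: the customer name now lives in two places, and in this sketch an order stays as written even if the name changes later.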
To implement an event-driven design yourself, publish an event at the point where a command is executed, so that every service interested in that event can take over the follow-up work, such as storing or classifying the data. Such implementations are usually combined with a message queuing system. For stricter and more stable setups, Event Sourcing is implemented on top of a log-based queueing system such as Kafka or NATS.
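The flow above can be sketched with an in-memory stand-in for the queueing system. The EventBus class below is hypothetical and is not the API of Kafka, NATS, or any other product; it only shows publishing at the command source, subscribers taking over, and an append-only log that can be replayed.

```python
from __future__ import annotations

from collections import Counter, defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """In-memory stand-in for a log-based queue (illustration only)."""

    def __init__(self) -> None:
        self.log: list[dict] = []  # append-only event log
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: dict) -> None:
        self.log.append(event)
        for handler in self._handlers.get(event["type"], []):
            handler(event)

    def replay(self, handler: Handler) -> None:
        """Feed the whole log to a handler, e.g. to build a new read model."""
        for event in self.log:
            handler(event)


bus = EventBus()
stored_orders: dict[str, dict] = {}   # "storing data"
orders_by_category: Counter = Counter()  # "classifying data"

bus.subscribe("OrderPlaced", lambda e: stored_orders.update({e["order_id"]: e}))
bus.subscribe("OrderPlaced", lambda e: orders_by_category.update([e["category"]]))


def place_order(order_id: str, category: str) -> None:
    """Command source: do the work, then publish the event."""
    bus.publish({"type": "OrderPlaced", "order_id": order_id, "category": category})


place_order("o-1", "books")
place_order("o-2", "books")
place_order("o-3", "toys")

# A service added later can rebuild its state from the log.
late_count: Counter = Counter()
bus.replay(lambda e: late_count.update([e["type"]]))
```

Keeping the log is what allows a late subscriber to catch up, which is the property that log-based systems offer over a queue that discards messages once delivered.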
CQRS raises quite a few issues. The techniques involved span database design and message queues, and a deeper study quickly leads to the question of how to split microservices. This article only covers some practical ways of applying CQRS.
In addition, CQRS essentially trades space for time: it accepts duplication over reuse to improve overall system performance. The drawback of a distributed design is, of course, that the system architecture looks complicated and fragmented. Events are the only thing that ties the whole system together, so event-driven design is another topic that deserves in-depth discussion.